Bahadur slope of the t-statistic for a contaminated normal*


Narasinga R. Chaganty^1 and Jayaram Sethuraman^2
Old  Dominion  University  and  Florida  State  University 

Abstract 

In this paper we derive the Bahadur slope of the t-statistic based
on a random sample from a contaminated normal distribution,
using some results in large deviation theory. We also present a
table of Bahadur slopes at various alternatives and at several levels
of contamination.


1. INTRODUCTION. To study robustness of standard tests of location in a normal
model, one generally studies their properties under the Tukey model (see Tukey (1960))
of contaminated normal alternatives, namely, the probability distributions $P_{(\epsilon,\theta,\sigma)}$ with
probability density function (pdf)

$$f_{(\epsilon,\theta,\sigma)}(x) = (1-\epsilon)\,\phi(x;\theta,1) + \epsilon\,\phi(x;\theta,\sigma) \tag{1}$$

for $0 < \epsilon < 1$, where $\phi(x;\theta,\sigma)$ is the pdf of a normal distribution with mean $\theta$ and
variance $\sigma^2$.

Suppose that $X_1, X_2, \ldots, X_n$ is a random sample from $f_{(\epsilon,\theta,\sigma)}(x)$ and that we wish to
test the null hypothesis $\theta = 0$ using the t-statistic $T_n = \bar{X}_n / S_n$, where $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$
and $S_n^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X}_n)^2$. The robustness of this t-test as measured by Pitman
efficiency has been studied in the famous Princeton study by Andrews et al. (1972). In
this paper we derive the large deviation rate of $T_n$ under $P_{(\epsilon,0,\sigma)}$, which allows us to
obtain the Bahadur slopes of the t-test under a general alternative $P_{(\epsilon,\theta,\sigma)}$. Following
the practice of other authors, we set $\sigma$ equal to 3, and give the Bahadur slopes for various
values of $\epsilon$ and $\theta$ in Table 1. This table gives an indication of the region of robustness
of the t-test as measured by the Bahadur slope. The robustness of the t-test, in the
sense of Bahadur efficiency, is gleaned by comparing the slope at the contaminated
distribution $P_{(\epsilon,\theta,\sigma)}$ with the slope at the uncontaminated distribution $P_{(0,\theta,\sigma)}$. As
expected, Table 1 shows that there is adequate robustness in a region of small values of
$\epsilon$. Furthermore, for a fixed $\theta$ the slope is a decreasing function of $\epsilon$, and for a fixed $\epsilon$ the
slope is an increasing function of $\theta$.

The exact distribution of $T_n$ under $P_{(\epsilon,\theta,\sigma)}$ has been derived in Lee and Gurland (1977).
We will derive the large deviation rate of $T_n$ under $P_{(\epsilon,0,\sigma)}$ and the Bahadur slope under
the alternative $P_{(\epsilon,\theta,\sigma)}$ in Section 2.

*Research partially supported by the U.S. Army Research Office grant numbers ^1DAAL03-91-G-0179,
... for Governmental purposes notwithstanding any copyright notation thereon.

2. LARGE DEVIATION RATES AND BAHADUR SLOPES. We refer to the excellent monograph
of Varadhan (1984) for an introduction to the theory of large deviations and to
the monograph of Bahadur (1971) for the concept of Bahadur slopes and efficiencies.
One needs a strong law under the alternative and a large deviation result under the null
hypothesis to obtain Bahadur slopes. It is easy to see from the usual strong law of large
numbers that

$$T_n \to m(\epsilon,\theta,\sigma) = \frac{\theta}{\sqrt{(1-\epsilon) + \epsilon\sigma^2}} \tag{2}$$

with probability one under $P_{(\epsilon,\theta,\sigma)}$. We need to obtain a result of the form

$$\frac{1}{n}\log P_{(\epsilon,0,\sigma)}(T_n \ge m) \to -\gamma(m), \tag{3}$$

where $\gamma(m)$ is continuous in $m$. The function $\gamma(m)$ is usually referred to as the large
deviation rate function. It then follows that the Bahadur slope is given by

$$c(\epsilon,\theta,\sigma) = 2\gamma(m(\epsilon,\theta,\sigma)). \tag{4}$$
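The strong-law limit of $T_n$ can be checked by simulation. The following is a minimal sketch, not part of the original paper: it assumes the limit $m(\epsilon,\theta,\sigma) = \theta/\sqrt{(1-\epsilon)+\epsilon\sigma^2}$ and the $1/n$ normalisation of $S_n^2$, and the function names, seed and sample size are illustrative choices.

```python
import math
import random

def sample_contaminated_normal(n, eps, theta, sigma, rng):
    """Draw n observations from (1 - eps) N(theta, 1) + eps N(theta, sigma^2)."""
    return [rng.gauss(theta, sigma if rng.random() < eps else 1.0)
            for _ in range(n)]

def t_statistic(xs):
    """T_n = mean / S_n, where S_n^2 uses the 1/n normalisation (assumed)."""
    n = len(xs)
    mean = sum(xs) / n
    s2 = sum((x - mean) ** 2 for x in xs) / n
    return mean / math.sqrt(s2)

def limit_m(eps, theta, sigma):
    """Almost-sure limit of T_n under the alternative."""
    return theta / math.sqrt((1.0 - eps) + eps * sigma ** 2)

rng = random.Random(12345)  # fixed seed so the run is repeatable
xs = sample_contaminated_normal(200_000, eps=0.10, theta=1.0, sigma=3.0, rng=rng)
print("T_n     =", t_statistic(xs))
print("m limit =", limit_m(0.10, 1.0, 3.0))
```

For a sample this large the two printed numbers should agree to roughly two decimal places; the gap shrinks at the usual $n^{-1/2}$ rate.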

We now proceed with the derivation of $\gamma(m)$. Note that the event $\{T_n^2 > m^2\}$ is equal to
the event $\{W_n > 0\}$, where $W_n$ is the quadratic form $W_n = X'AX/n$ with $A = J - naI$,
$a = m^2/(1+m^2)$, $I$ is the identity matrix and $J$ is a matrix of ones. Since the distribution
of $T_n$ is symmetric under $P_{(\epsilon,0,\sigma)}$, we have

$$P(T_n > m) = \tfrac{1}{2}\,P(W_n > 0). \tag{5}$$
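The identification of the two events is elementary algebra. A short derivation, assuming $m > 0$ and $S_n^2 = \frac{1}{n}\sum_i (X_i - \bar{X}_n)^2 = \frac{1}{n}\sum_i X_i^2 - \bar{X}_n^2$, and using $X'JX = (\sum_i X_i)^2 = n^2\bar{X}_n^2$:

```latex
\begin{align*}
T_n^2 > m^2
  &\iff \bar{X}_n^2 > m^2\Big(\tfrac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}_n^2\Big) \\
  &\iff (1+m^2)\,\bar{X}_n^2 > \tfrac{m^2}{n}\sum_{i=1}^{n} X_i^2 \\
  &\iff n\bar{X}_n^2 - a\sum_{i=1}^{n} X_i^2 > 0,
        \qquad a = \frac{m^2}{1+m^2} \\
  &\iff \tfrac{1}{n}\,X'(J - naI)X > 0 .
\end{align*}
```

Under $\theta = 0$ the distribution of $T_n$ is symmetric about zero, so $P(T_n > m) = P(T_n < -m) = \tfrac{1}{2}P(T_n^2 > m^2)$, which gives the factor $\tfrac{1}{2}$.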

(From here onwards, $P$ without a suffix corresponds to the probability under $P_{(\epsilon,0,\sigma)}$.)
The left-hand side of (5) can be appropriately approximated by using the moment
generating function (mgf) of $W_n$, which is given by

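$$M_n(t) = E[\exp(tW_n)] = \sum_{k=0}^{n} \binom{n}{k} (1-\epsilon)^k \epsilon^{n-k} \left| I - \tfrac{2t}{n}\Lambda_k A \right|^{-1/2}, \tag{6}$$

where $\Lambda_k = \mathrm{diag}(1, \ldots, 1, \sigma^2, \ldots, \sigma^2)$, with $k$ ones followed by $n-k$ entries equal to $\sigma^2$. Let $p = k/n$ and $q = 1 - p$. Using a matrix
determinant formula (see Appendix), we can show that

$$M_{np}(t) = \left| I - \tfrac{2t}{n}\Lambda_k A \right|^{-1/2} = (f_1(t))^{-np/2}\,(f_2(t))^{-nq/2} \left( \frac{p f_3(t)}{f_1(t)} + \frac{q f_4(t)}{f_2(t)} \right)^{-1/2}, \tag{7}$$

where $f_1(t) = 1 + 2at$, $f_2(t) = 1 + 2at\sigma^2$, $f_3(t) = 1 - 2t(1-a)$ and $f_4(t) = 1 - 2t\sigma^2(1-a)$.
Thus the mgf of $W_n$ is given by

$$M_n(t) = \sum_{k=0}^{n} \binom{n}{k} (1-\epsilon)^k \epsilon^{n-k} M_{np}(t) \quad \text{for } t_*(p) < t < t^*(p), \tag{8}$$

where $t_*(p) < t^*(p)$ are the roots of the quadratic equation $p f_2(t) f_3(t) + q f_1(t) f_4(t) = 0$.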
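The closed form behind (7) can be checked numerically on a small case. The sketch below is illustrative only: it builds $I - \frac{2t}{n}\Lambda_k A$ entry by entry, computes its determinant with a hand-written Gaussian elimination (to stay dependency-free), and compares it with $f_1^{k} f_2^{n-k}\,(p f_3/f_1 + q f_4/f_2)$. The helper names and the test values of $n$, $k$, $a$, $\sigma$, $t$ are assumptions, not taken from the paper.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            d = -d  # a row swap flips the sign
        d *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return d

def direct_det(n, k, a, sigma, t):
    """|I - (2t/n) Lambda_k A| with A = J - n a I, built entry by entry."""
    lam = [1.0] * k + [sigma ** 2] * (n - k)
    rows = []
    for i in range(n):
        row = []
        for j in range(n):
            a_ij = 1.0 - (n * a if i == j else 0.0)  # entry of A = J - n a I
            row.append((1.0 if i == j else 0.0) - (2.0 * t / n) * lam[i] * a_ij)
        rows.append(row)
    return det(rows)

def closed_form_det(n, k, a, sigma, t):
    """f1^k f2^(n-k) (p f3/f1 + q f4/f2), i.e. M_np(t)^(-2)."""
    p, q = k / n, 1.0 - k / n
    f1 = 1.0 + 2.0 * a * t
    f2 = 1.0 + 2.0 * a * t * sigma ** 2
    f3 = 1.0 - 2.0 * t * (1.0 - a)
    f4 = 1.0 - 2.0 * t * sigma ** 2 * (1.0 - a)
    return f1 ** k * f2 ** (n - k) * (p * f3 / f1 + q * f4 / f2)

print(direct_det(6, 4, 0.4, 3.0, 0.1), closed_form_det(6, 4, 0.4, 3.0, 0.1))
```

The two printed values should coincide up to rounding; $M_{np}(t)$ is then the determinant raised to the power $-1/2$, wherever the determinant is positive.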


From the above formula for the mgf $M_n(t)$, we can conclude that the distribution
of $W_n$ is a mixture distribution. More precisely, let $K$ be a binomial random variable
with parameters $n$ and $(1 - \epsilon)$. Given $K = k$, let $W_{n,k}$ be a random variable with mgf
given by $M_{np}$, where $p = k/n$. From (8) we can see that $W_n$ is equal in distribution
to $W_{n,K}$. This observation, coupled with a theorem of Varadhan (see Theorem 2.2 in
Chaganty (1993)), is useful for deriving the large deviation rate function of the random
variable $W_n$. Theorem 1 below shows that the conditions in Varadhan's theorem are
indeed satisfied in our problem.
THEOREM 1 Let $K$ be a binomial random variable with parameters $n$ and $(1 - \epsilon)$. Given
$K = k_n = np_n$, let $W_{n,k_n}$ be a random variable with mgf $M_{np_n}(t)$ defined in (7). If
$p_n \to p$ then

$$F_n(p_n) = \frac{1}{n}\log P(W_{n,k_n} > 0) \to F(p) \quad \text{as } n \to \infty, \tag{9}$$

where $F(p) = -\tfrac{1}{2}\left[\,p \log f_1(t^*(p)) + q \log f_2(t^*(p))\,\right]$, $q = 1 - p$.

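Proof: Upper bound: By Chebyshev's inequality it follows that

$$\limsup_n \frac{1}{n}\log P(W_{n,k_n} > 0) \le \lim_n \frac{1}{n}\log M_{np_n}(t) = -\tfrac{1}{2}\left[\,p \log f_1(t) + q \log f_2(t)\,\right] \tag{10}$$

for any $0 < t < t^*(p)$. Hence

$$\limsup_n F_n(p_n) = \limsup_n \frac{1}{n}\log P(W_{n,k_n} > 0) \le \inf_{0 < t < t^*(p)} \left\{ -\tfrac{1}{2}\left[\,p \log f_1(t) + q \log f_2(t)\,\right] \right\} = F(p). \tag{11}$$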

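Lower bound: Let $G_{np_n}$ denote the distribution function of $W_{n,k_n}$. Let us introduce
another random variable $V_n$ with the conjugate distribution function given by

$$dG_{nt_n}(x) = \frac{\exp(x t_n)\, dG_{np_n}(x)}{M_{np_n}(t_n)}, \tag{12}$$

where $t_n = t^*(p)\left(1 - \tfrac{1}{n}\right)$. Now for any $\delta > 0$ we have

$$\begin{aligned}
P(W_{n,k_n} > 0) &= \int_0^{\infty} dG_{np_n}(x) = M_{np_n}(t_n) \int_0^{\infty} \exp(-x t_n)\, dG_{nt_n}(x) \\
&\ge M_{np_n}(t_n) \int_0^{n\delta} \exp(-x t_n)\, dG_{nt_n}(x) \\
&\ge M_{np_n}(t_n) \exp(-n\delta t_n)\, P(0 < V_n < n\delta).
\end{aligned} \tag{13}$$

Therefore,

$$\frac{1}{n}\log P(W_{n,k_n} > 0) \ge \frac{1}{n}\log M_{np_n}(t_n) - \delta t_n + \frac{1}{n}\log P(0 < V_n < n\delta). \tag{14}$$

Since $p_n \to p$ as $n \to \infty$, it follows from (7) that

$$\frac{1}{n}\log M_{np_n}(t_n) \to -\tfrac{1}{2}\left[\,p \log f_1(t^*(p)) + q \log f_2(t^*(p))\,\right] = F(p). \tag{15}$$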

We will now show that the limiting distribution of $V_n/n$ is a translated gamma
distribution. To find the limiting distribution, we first note that the mgf of $V_n/n$ is given by
$\bar{M}_n(s) = M_{np_n}(s_n) / M_{np_n}(t_n)$, where $s_n = t_n + s/n$. It is easy to check that

$$\bar{M}_n(s) \to M(s) = \exp(-sc)\left(1 - \frac{s}{t^*(p)}\right)^{-1/2} \tag{16}$$

for $s < t^*(p)$, where $c = ap/(1 + 2at^*(p)) + aq\sigma^2/(1 + 2at^*(p)\sigma^2)$. Thus $V_n/n$ converges
in distribution to $V - c$, where $V$ is a Gamma random variable with shape parameter
$1/2$ and scale parameter $1/t^*(p)$. Therefore,

$$P(0 < V_n/n < \delta) \to P(c < V < c + \delta) > 0 \quad \text{as } n \to \infty. \tag{17}$$
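The limit of the mgf of $V_n/n$ can be sketched factor by factor. The derivation below is a heuristic outline under simplifying assumptions that are not spelled out in the text: it takes $p_n = p$, $t_n = t^*(p)(1 - 1/n)$, writes $g(t) = p f_3(t)/f_1(t) + q f_4(t)/f_2(t)$ so that $M_{np}(t) = f_1(t)^{-np/2} f_2(t)^{-nq/2} g(t)^{-1/2}$, and uses that $t^*(p)$ is a simple zero of $g$.

```latex
\begin{align*}
\left(\frac{f_1(t_n + s/n)}{f_1(t_n)}\right)^{-np/2}
  &= \left(1 + \frac{2as}{n f_1(t_n)}\right)^{-np/2}
  \longrightarrow \exp\!\left(-\frac{aps}{1 + 2at^*(p)}\right), \\
\left(\frac{f_2(t_n + s/n)}{f_2(t_n)}\right)^{-nq/2}
  &= \left(1 + \frac{2a\sigma^2 s}{n f_2(t_n)}\right)^{-nq/2}
  \longrightarrow \exp\!\left(-\frac{aq\sigma^2 s}{1 + 2at^*(p)\sigma^2}\right), \\
\frac{g(t_n + s/n)}{g(t_n)}
  &\approx \frac{t^*(p) - t_n - s/n}{t^*(p) - t_n}
  = \frac{t^*(p)/n - s/n}{t^*(p)/n}
  = 1 - \frac{s}{t^*(p)} .
\end{align*}
```

Multiplying the three limits, with the last one raised to the power $-1/2$, gives $\exp(-sc)\,(1 - s/t^*(p))^{-1/2}$, the mgf of a Gamma variable with shape $1/2$ and scale $1/t^*(p)$ shifted by $-c$.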

From (14), (15) and (17) we get

$$\liminf_n F_n(p_n) = \liminf_n \frac{1}{n}\log P(W_{n,k_n} > 0) \ge F(p) - \delta\, t^*(p).$$

Since $\delta$ is arbitrary we get $\liminf_n F_n(p_n) \ge F(p)$. This completes the proof of the
theorem.

We are now in a position to derive the large deviation rate function for $T_n$. From
Theorem 1 we have

$$F_n(p_n) = \frac{1}{n}\log P(W_n > 0 \mid K = np_n) \to F(p) \tag{18}$$

whenever $p_n \to p$. Note that

$$\frac{1}{n}\log P(W_n > 0) = \frac{1}{n}\log \int \exp(nF_n(p))\, d\mu_n(p), \tag{19}$$

where $\mu_n$ is the distribution of $K/n$. Since the distribution of $K$ is binomial, it is known
that the sequence of probability measures $\{\mu_n\}$ obeys the large deviation principle (see
Varadhan (1984) for the definition) with rate function

$$h(p) = p \log(p/(1-\epsilon)) + q \log(q/\epsilon).$$
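The role of $h(p)$ as the exponential decay rate of binomial probabilities can be illustrated numerically. The sketch below is an illustration with assumed values of $\epsilon$, $p$ and $n$: it compares $-\frac{1}{n}\log P(K = np)$ for $K \sim \text{Binomial}(n, 1-\epsilon)$ with $h(p)$, using `math.lgamma` for the log binomial coefficient.

```python
import math

def h(p, eps):
    """Rate function h(p) = p log(p/(1-eps)) + q log(q/eps), for 0 < p < 1."""
    q = 1.0 - p
    return p * math.log(p / (1.0 - eps)) + q * math.log(q / eps)

def log_binom_pmf(n, k, eps):
    """log P(K = k) for K ~ Binomial(n, 1 - eps), computed via log-gamma."""
    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    return log_choose + k * math.log(1.0 - eps) + (n - k) * math.log(eps)

eps, p = 0.05, 0.90
for n in (100, 1000, 10000):
    k = round(n * p)
    # the empirical rate approaches h(p) as n grows
    print(n, -log_binom_pmf(n, k, eps) / n, h(p, eps))
```

The printed empirical rate exceeds $h(p)$ by a term of order $(\log n)/n$, which is the usual Stirling correction, and $h$ vanishes at $p = 1 - \epsilon$.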


Using the theorem of Varadhan (see Theorem 2.2 in Chaganty (1993)), together with (18) and (19),
it follows that

$$\frac{1}{n}\log P(W_n > 0) \to \sup_{0 < p < 1} \left( F(p) - h(p) \right). \tag{20}$$

From (5) and (20) we get

$$\frac{1}{n}\log P(T_n \ge m) \to -\gamma(m),$$

where $\gamma(m) = \inf_{0 < p < 1} \left[ -F(p) + h(p) \right]$.

The rate function $\gamma(m)$ can easily be computed numerically using the Newton-Raphson
method. In Table 1 we present the Bahadur slope, $c(\epsilon,\theta,\sigma) = 2\gamma(m(\epsilon,\theta,\sigma))$, for different
values of $\epsilon$ and $\theta$ when $\sigma = 3$. Note that a large value of $c(\epsilon,\theta,\sigma)$ indicates that the test
statistic $T_n$ requires a smaller sample size to detect that particular alternative.
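A minimal numerical sketch of the slope computation follows. It is not the authors' program: instead of Newton-Raphson it uses a coarse grid over $p$ followed by a ternary search, which assumes the objective is unimodal near the grid minimiser. It also assumes $\theta > 0$, $0 \le \epsilon < 1$, the limit $m = \theta/\sqrt{(1-\epsilon)+\epsilon\sigma^2}$, and that $t^*(p)$ is the positive root of $p f_2 f_3 + q f_1 f_4 = 0$; all function names are illustrative.

```python
import math

def t_star(p, a, sigma):
    """Positive root of p f2 f3 + q f1 f4 = 0, expanded as A t^2 + B t + 1 = 0."""
    s2 = sigma ** 2
    q = 1.0 - p
    A = -4.0 * a * (1.0 - a) * s2
    B = 2.0 * (p * (a * s2 - (1.0 - a)) + q * (a - s2 * (1.0 - a)))
    return (-B - math.sqrt(B * B - 4.0 * A)) / (2.0 * A)

def neg_F(p, a, sigma):
    """-F(p) = (1/2) [p log f1(t*) + q log f2(t*)] with t* = t_star(p)."""
    t = t_star(p, a, sigma)
    return 0.5 * (p * math.log1p(2.0 * a * t)
                  + (1.0 - p) * math.log1p(2.0 * a * t * sigma ** 2))

def h(p, eps):
    """Binomial rate function, with the convention 0 log 0 = 0."""
    total = 0.0
    if p > 0.0:
        total += p * math.log(p / (1.0 - eps))
    if p < 1.0:
        total += (1.0 - p) * math.log((1.0 - p) / eps)
    return total

def bahadur_slope(eps, theta, sigma=3.0, grid=2000, iters=80):
    """c(eps, theta, sigma) = 2 inf_p [-F(p) + h(p)]; assumes theta > 0."""
    m = theta / math.sqrt((1.0 - eps) + eps * sigma ** 2)
    a = m * m / (1.0 + m * m)
    if eps == 0.0:
        return 2.0 * neg_F(1.0, a, sigma)  # K/n is degenerate at p = 1
    obj = lambda p: neg_F(p, a, sigma) + h(p, eps)
    best = min(range(grid + 1), key=lambda i: obj(i / grid))
    lo, hi = max(best - 1, 0) / grid, min(best + 1, grid) / grid
    for _ in range(iters):  # ternary search inside the bracketing cell
        left = lo + (hi - lo) / 3.0
        right = hi - (hi - lo) / 3.0
        if obj(left) < obj(right):
            hi = right
        else:
            lo = left
    return 2.0 * obj(0.5 * (lo + hi))

for eps in (0.0, 0.05, 0.10):
    print(eps, [round(bahadur_slope(eps, th), 5) for th in (0.5, 1.0, 2.0)])
```

With $\epsilon = 0$ the formula reduces to $c = \log(1 + \theta^2)$, which gives a quick sanity check; for $\epsilon > 0$ the output can be compared with the corresponding entries of Table 1.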


Table 1. Slope of the t-statistic, c(ε, θ, σ), for the contaminated normal
model, when σ = 3.

  ε \ θ     0.25     0.50      1.0      1.5      2.0      2.5      3.0
  0.00   0.06066  0.22314  0.69314  1.17866  1.60944        -  2.30258
  0.05   0.04488  0.17380  0.56738  0.99566  1.39154  1.74208  2.05046
  0.10   0.03508  0.14056  0.48860  0.87952  1.24944  1.58306  1.88092
  0.15   0.02866  0.11598  0.42936  0.79694  1.14852  1.46908  1.75732
  0.25   0.02090  0.08422  0.33264  0.67160  1.00634  1.31238  1.58918

REMARK 1 It is possible to derive, in a similar manner, the Bahadur slope of the
t-statistic for a random sample of $n$ observations with common pdf given by $f(x) = \sum_{i=1}^{L} \pi_i\, \phi(x;\theta,\sigma_i)$,
where $\sum_{i=1}^{L} \pi_i = 1$ and $\pi_i > 0$ for all $i$, with $L \ge 1$. In this case the multinomial
distribution plays the role of the binomial distribution in the derivation of the slope.
More generally, using the results of Chaganty (1993), we can also establish the large
deviation principle for the t-statistic for this model.


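3. APPENDIX. In (7) we have used the following determinant formula. Consider the
partitioned matrix

$$\begin{pmatrix} bI + cJ & cJ \\ eJ & dI + eJ \end{pmatrix},$$

whose diagonal blocks are of order $k$ and $(n-k)$, where $b$, $c$, $d$ and $e$ are constants and, as before, $I$ is the identity matrix and $J$ is the
matrix of ones. Then we can verify that

$$\begin{vmatrix} bI + cJ & cJ \\ eJ & dI + eJ \end{vmatrix} = b^{k} d^{\,n-k} \left( 1 + \frac{kc}{b} + \frac{(n-k)e}{d} \right). \tag{21}$$

To obtain the simplification in equation (7), we use the above formula (21) with the
substitutions $b = f_1(t)$, $d = f_2(t)$, $c = -\dfrac{2t}{n}$ and $e = -\dfrac{2t\sigma^2}{n}$.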
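One way to verify the determinant formula is through the rank-one update identity $|D + w\mathbf{1}'| = |D|\,(1 + \mathbf{1}'D^{-1}w)$. The sketch below assumes $b \neq 0$ and $d \neq 0$; the symbols $D$, $w$ and $\mathbf{1}$ are introduced here for illustration and do not appear in the original text.

```latex
\begin{align*}
\begin{pmatrix} bI + cJ & cJ \\ eJ & dI + eJ \end{pmatrix}
  &= D + w\mathbf{1}', \qquad
  D = \begin{pmatrix} bI_k & 0 \\ 0 & dI_{n-k} \end{pmatrix}, \quad
  w = \begin{pmatrix} c\,\mathbf{1}_k \\ e\,\mathbf{1}_{n-k} \end{pmatrix}, \\
|D + w\mathbf{1}'|
  &= |D|\,\bigl(1 + \mathbf{1}'D^{-1}w\bigr)
  = b^{k} d^{\,n-k}\left(1 + \frac{kc}{b} + \frac{(n-k)e}{d}\right).
\end{align*}
```

With $b = f_1(t)$, $d = f_2(t)$, $c = -2t/n$, $e = -2t\sigma^2/n$ and $p = k/n$, $q = 1 - p$, the bracket becomes $1 - 2tp/f_1(t) - 2t\sigma^2 q/f_2(t) = p f_3(t)/f_1(t) + q f_4(t)/f_2(t)$, since $f_1(t) - 2t = f_3(t)$ and $f_2(t) - 2t\sigma^2 = f_4(t)$.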